ABSTRACT

Association rule mining aims to find associations efficiently among the items of a transaction database. To help
decision-makers reach sound and timely decisions, we propose a two-phase fuzzy data mining method. In the first phase,
managers evaluate each item via the fuzzy analytic hierarchy process (FAHP) to generate fuzzy weights; in the second
phase, fuzzy association rules are generated by mining frequent items from the transaction database. An example is given
to illustrate the proposed approach.

KEYWORDS: Association Rule, Fuzzy Weights, Fuzzy Analytic Hierarchy Process (FAHP) 

1. INTRODUCTION 

Data mining is a methodology for extracting interesting knowledge from large amounts of data. This
knowledge may relate to problems to be solved [12]. Thus, data mining can ease the knowledge acquisition bottleneck in
building prototype systems [2]. If data mining can be applied effectively to many kinds of analysis, it will assist
business decision-making in areas such as marketing promotions, inventory management and customer
relationship management.

The most widely adopted approach is to induce association rules (X → Y) from a transaction database: where existing
items (X) appear, other items (Y) are likely to appear as well [5]. For instance, when a customer purchases bread, the
customer might also buy milk along with it. Accordingly, association rules can help decision makers identify the items
that consumers are likely to purchase, which facilitates the planning of marketing strategies [2].

Data mining supports the decisions needed to implement business strategies. Decision making, however,
is a process of evaluating whether to accept or reject alternatives based on goals and expected results; it chooses one or
more best alternatives or actions according to some criteria. Thus, multiple criteria decision making (MCDM) has been
widely used for decision making. Within MCDM, the analytic hierarchy process (AHP) is well known and is used in various
fields because the theory is simple and easy to operate. The method has been generally applied to empirical research on
major decision problems in various countries [16].

Besides, decision making must take into account the user's perception and the cognitive
uncertainty of subjective decisions. Zadeh proposed fuzzy theory [20] in 1965 to deal with the cognitive
uncertainty of vagueness and ambiguity. Linguistic variables and linguistic values [22-24] are described as fuzzy
concepts that correspond to the subjective cognition of a decision maker, which helps the analysis of decisions proceed.
Thus, fuzzy data mining has recently become an important research topic.

The goal of this paper is to propose an effective two-phase mining method. The first phase generates fuzzy
weights by applying FAHP. Because the items in the transaction data may not all have the same importance, each item
should take a different weight to distinguish its significance from that of the other items [16]. The second phase uses
the fuzzy partition method to transform the transaction data, which are then placed in the table FGTTFS to mine fuzzy
association rules.

2. REVIEW OF FAHP AND FUZZY PARTITION METHOD 

The analytic hierarchy process (AHP) was proposed by T. L. Saaty in 1971 [13]. It applies to decision making under
uncertainty and with multiple evaluation criteria. It analyzes a complex problem by building a hierarchical relationship
between alternatives and criteria. When decision making treats an MCDM problem in a hierarchical
structure, it can choose among multiple alternatives. Since the linguistic scale of AHP does not capture the fuzzy
uncertainty present when decision makers make judgments, Laarhoven & Pedrycz [17] extended conventional AHP and
developed the fuzzy analytic hierarchy process (FAHP). FAHP replaces the exact values in conventional AHP with interval
values so that experts can evaluate problems on a more human scale. Because FAHP can reflect the problems
encountered when decisions are made in an actual environment, it is often used in various fields [3, 4, 6, 10, 11, 14,
18, 19].

Next, by dividing each linguistic variable [22-24] into its linguistic values, $K_1 \times K_2 \times \cdots \times K_d$
fuzzy grids with d dimensions can be obtained in the pattern space. In particular, this paper views a fuzzy grid as a
fuzzy concept. When it is not yet determined whether a linguistic value is frequent, it is called a candidate 1-dim fuzzy
grid. A quantitative variable $x_k$ can be divided into K partitions (K = 2, 3, ...). In addition, $A^{x_k}_{K,i}$ stands
for a candidate 1-dim fuzzy grid, and its membership function $\mu^{x_k}_{K,i}(x)$ can be defined as follows:

$\mu^{x_k}_{K,i}(x) = \max\{1 - |x - a^K_i| / b^K, 0\}$, (1)

where $a^K_i = mi + (ma - mi)(i - 1)/(K - 1)$, $b^K = (ma - mi)/(K - 1)$, ma and mi are the maximum and
minimum values of the domain interval of $x_k$, and $A^{x_k}_{K,i}$ is the i-th of the K linguistic values defined on the
linguistic variable $x_k$. Furthermore, candidate 1-dim fuzzy grids can be further employed to generate the other
candidate or frequent fuzzy grids with higher dimensions. The fuzzy partition method has been widely used in pattern
recognition and fuzzy reasoning [7-9, 21].
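The membership function of Eq. (1) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `partition_memberships` and the domain [1, 13] are assumptions. The domain was chosen because, with K = 3, it reproduces the fuzzy set (0.167/A.Small + 0.833/A.Middle) that the example in Section 4 reports for an amount of 6.

```python
def partition_memberships(x, lo, hi, K):
    """Membership degrees of x in the K triangular fuzzy grids of Eq. (1)."""
    b = (hi - lo) / (K - 1)                  # common half-width b^K
    peaks = [lo + b * i for i in range(K)]   # a^K_i for i = 1..K
    return [max(1.0 - abs(x - a) / b, 0.0) for a in peaks]


# Assumed domain [1, 13] with K = 3 (Small, Middle, Large).
small, middle, large = partition_memberships(6, lo=1, hi=13, K=3)
print(round(small, 3), round(middle, 3), round(large, 3))  # prints 0.167 0.833 0.0
```

Because adjacent triangles overlap by exactly one half-width, the degrees of any value inside the domain sum to 1.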

3. NOTATION AND ALGORITHM 

3.1 Notation 


n: the number of items;

m: the number of transactions in the database;

h: the number of items used to describe each transaction, where 1 ≤ h;

g: the number of managers;

$x_k$: the k-th item, where 1 ≤ k ≤ h;

K: the number of linguistic values of each quantitative item of the transaction data, where K ≥ 2;

$t_p$: the p-th transaction, where 1 ≤ p ≤ m;

$A^{x_k}_{K,i}$: the i-th of the K linguistic values defined on the linguistic variable $x_k$, where 1 ≤ k ≤ h, 1 ≤ i ≤ K;

$\mu^{x_k}_{K,i}(\cdot)$: the membership function of $A^{x_k}_{K,i}$;

$t_p^{x_k}$: the quantitative value of the item $x_k$ in the p-th transaction;

$FS(A^{x_k}_{K,i})$: the fuzzy support of the fuzzy grid $A^{x_k}_{K,i}$ for each item $x_k$;

$d_{x_k}$: the fuzzy weighted value of the item $x_k$;

α: the user-specified linguistic minimum fuzzy support value;

β: the user-specified linguistic minimum fuzzy confidence value.


3.2 The Algorithm 
Fuzzy AHP: 

Input: n evaluative items. 

Output: A set of Fuzzy weights. 

Step 1: Construct fuzzy pairwise comparison matrices. Through questionnaires given to g managers, each respondent is
asked to assign linguistic terms, expressed as triangular fuzzy numbers (TFNs, as shown in Table 1), to the pairwise
comparisons among all items. The results of the comparisons are assembled into fuzzy pairwise comparison matrices
($\tilde{A}^g$) as shown in Eq. (1).

Step 2: Examine the consistency of the fuzzy pairwise comparison matrices. According to [1], if $A = [a_{st}]$ is a
positive reciprocal matrix, then $\tilde{A} = [\tilde{a}_{st}]$ is a fuzzy positive reciprocal matrix. Namely, if the
comparisons in $A = [a_{st}]$ are consistent, then the comparisons in $\tilde{A} = [\tilde{a}_{st}]$ are also
consistent.

Step 3: Compute the fuzzy geometric mean for each item. The geometric mean technique is used to calculate the
geometric mean ($\tilde{r}_s$) of the fuzzy comparison values of item s against each item, as shown in Eq. (2), where
$\tilde{a}_{sn}$ is the fuzzy value of the pairwise comparison of item s with item n [1].

$\tilde{r}_s = (\tilde{a}_{s1} \otimes \tilde{a}_{s2} \otimes \cdots \otimes \tilde{a}_{sn})^{1/n}$ (2)


Table 1: Membership Functions of the Linguistic Scale

Fuzzy Number | Linguistic Scale | TFN ($\tilde{a}_{st}$) | Reciprocal of a TFN ($\tilde{a}_{st}^{-1}$)
$\tilde{1}$ | Equally important | (1, 1, 3) | (1/3, 1, 1)
$\tilde{3}$ | Weakly important | (1, 3, 5) | (1/5, 1/3, 1)
$\tilde{5}$ | Essentially important | (3, 5, 7) | (1/7, 1/5, 1/3)
$\tilde{7}$ | Very strongly important | (5, 7, 9) | (1/9, 1/7, 1/5)
$\tilde{9}$ | Absolutely important | (7, 9, 9) | (1/9, 1/9, 1/7)
$\tilde{2}, \tilde{4}, \tilde{6}, \tilde{8}$ | Intermediate values between two adjacent judgments | |


Step 4: Compute the fuzzy weights by normalization. The fuzzy weight of the s-th item ($\tilde{w}_s$) can be derived as
in Eq. (3), where $\tilde{w}_s$ is denoted by the TFN $\tilde{w}_s = (L_{w_s}, M_{w_s}, U_{w_s})$, and $L_{w_s}$,
$M_{w_s}$ and $U_{w_s}$ represent the lower, middle and upper values of the weight of the s-th item.

$\tilde{w}_s = \tilde{r}_s \otimes (\tilde{r}_1 \oplus \tilde{r}_2 \oplus \cdots \oplus \tilde{r}_n)^{-1}$ (3)

Step 5: Defuzzify the fuzzy weights. Defuzzification transforms fuzzy numbers into crisp values; the proposed method
determines the best non-fuzzy performance (BNP) value. This paper adopts Center of Area (COA) defuzzification, which is
simple and convenient in application and does not need extra parameter values [15], as in Eq. (4). Finally, we use the
resulting crisp value $d_{x_s}$ as the fuzzy weighted value of item $x_s$.

$d_{x_s} = [(U_{w_s} - L_{w_s}) + (M_{w_s} - L_{w_s})]/3 + L_{w_s}$ (4)

Mining Fuzzy Association Rules:

Input: A body of m training data; a set of fuzzy weights $d_{x_k}$; K linguistic values for each linguistic variable; a
user-specified minimum fuzzy support α and a user-specified minimum fuzzy confidence β.

Output: A set of fuzzy weighted association rules.

Step 1: Utilize the fuzzy partition method to transform the quantitative value $t_p^{x_k}$ of each item $x_k$ in each
transaction datum $t_p$ (p = 1, 2, ..., m) into fuzzy grids $A^{x_k}_{K,i}$, represented as:

$\mu^{x_k}_{K,1}(t_p^{x_k})/A^{x_k}_{K,1} + \mu^{x_k}_{K,2}(t_p^{x_k})/A^{x_k}_{K,2} + \cdots + \mu^{x_k}_{K,K}(t_p^{x_k})/A^{x_k}_{K,K}$

using the given membership functions $\mu^{x_k}_{K,i}(\cdot)$, where $A^{x_k}_{K,i}$ is the i-th fuzzy grid of the K
linguistic values defined on the linguistic variable $x_k$ of the p-th transaction.

Step 2: Construct the table FGTTFS with the following substeps to generate frequent fuzzy grids [7]. The initial form
is shown in Table 2.

• Fuzzy grid table (FG): each row represents a fuzzy grid, and each column represents a linguistic value.

• Transaction table (TT): each column represents $t_p$, and each element records the membership degree of the
corresponding fuzzy grid.

• Fuzzy support (FS): this column records the fuzzy support of each fuzzy grid $A^{x_k}_{K,i}$, which for each
item $x_k$ in the training data is calculated as in Eq. (5).

$FS(A^{x_k}_{K,i}) = \left[\sum_{p=1}^{m} \mu^{x_k}_{K,i}(t_p^{x_k}) \cdot d_{x_k}\right] / m$ (5)

Step 3: Generate frequent 1-dim fuzzy grids. Set k = 1 and eliminate the rows of the initial FGTTFS that
correspond to infrequent 1-dim fuzzy grids.

Step 4: Generate frequent k-dim fuzzy grids (k ≥ 2). Set k := k + 1. If there is only one (k-1)-dim fuzzy grid, then
go to Step 5. For any two unpaired grids, $A_u$ and $A_v$ (u ≠ v), corresponding to (k-1)-dim fuzzy grids, combine
($A_u$, $A_v$) into a candidate k-dim fuzzy grid $A_c$.

• Examine the validity of c. If two linguistic values of $A_u$ and $A_v$ belong to the same item, then discard c;
namely, $A_c$ is invalid.

• Insert ($A_u$, $A_v$) into FG, ($N_p$) into TT and FS($A_c$) into FS when the fuzzy support of c is larger than or
equal to the user-specified minimum fuzzy support α; otherwise, discard c.

$N_p = \min[\mu_{c_1}(t_p^{x_1}) \cdot d_{x_1}, \mu_{c_2}(t_p^{x_2}) \cdot d_{x_2}, \ldots, \mu_{c_k}(t_p^{x_k}) \cdot d_{x_k}]$ (6)

The t-norm operator in the fuzzy intersection is the minimum operator. Then, the fuzzy support FS($A_c$) of
frequent grid c is calculated as $FS(A_c) = \left[\sum_{p=1}^{m} N_p\right] / m$.

Step 5: Check whether any frequent k-dim fuzzy grid has been generated. If frequent k-dim fuzzy grids exist, go to
Step 4.

Step 6: Construct effective association rules for each frequent grid $A_c$ ($A_{c_1}, A_{c_2}, \ldots, A_{c_k}$), k ≥ 2,
as follows:

• List all possible rules of the frequent grid: $A_{c_1}, A_{c_2}, \ldots, A_{c_{k-1}} \Rightarrow A_{c_k}$.

• Calculate the fuzzy weighted confidence values of all association rules using

$FC(A_c) = FS(A_{c_1}, A_{c_2}, \ldots, A_{c_k}) / FS(A_{c_1}, A_{c_2}, \ldots, A_{c_{k-1}})$

Step 7: Check whether FC($A_c$) is larger than or equal to the user-specified minimum fuzzy confidence β, and then
output the fuzzy weighted association rules.


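Table 2: Initial Table FGTTFS

Fuzzy grid | FG: $A^{x_1}_{3,1}$ | $A^{x_1}_{3,2}$ | $A^{x_1}_{3,3}$ | $A^{x_2}_{3,1}$ | $A^{x_2}_{3,2}$ | $A^{x_2}_{3,3}$ | TT: $t_1$ | $t_2$ | FS
$A^{x_1}_{3,1}$ | 1 | 0 | 0 | 0 | 0 | 0 | $\mu^{x_1}_{3,1}(t_1^{x_1})$ | $\mu^{x_1}_{3,1}(t_2^{x_1})$ | $FS(A^{x_1}_{3,1})$
$A^{x_1}_{3,2}$ | 0 | 1 | 0 | 0 | 0 | 0 | $\mu^{x_1}_{3,2}(t_1^{x_1})$ | $\mu^{x_1}_{3,2}(t_2^{x_1})$ | $FS(A^{x_1}_{3,2})$
$A^{x_1}_{3,3}$ | 0 | 0 | 1 | 0 | 0 | 0 | $\mu^{x_1}_{3,3}(t_1^{x_1})$ | $\mu^{x_1}_{3,3}(t_2^{x_1})$ | $FS(A^{x_1}_{3,3})$
$A^{x_2}_{3,1}$ | 0 | 0 | 0 | 1 | 0 | 0 | $\mu^{x_2}_{3,1}(t_1^{x_2})$ | $\mu^{x_2}_{3,1}(t_2^{x_2})$ | $FS(A^{x_2}_{3,1})$
$A^{x_2}_{3,2}$ | 0 | 0 | 0 | 0 | 1 | 0 | $\mu^{x_2}_{3,2}(t_1^{x_2})$ | $\mu^{x_2}_{3,2}(t_2^{x_2})$ | $FS(A^{x_2}_{3,2})$
$A^{x_2}_{3,3}$ | 0 | 0 | 0 | 0 | 0 | 1 | $\mu^{x_2}_{3,3}(t_1^{x_2})$ | $\mu^{x_2}_{3,3}(t_2^{x_2})$ | $FS(A^{x_2}_{3,3})$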


4. AN EXAMPLE 

In this section, an example is given to illustrate the proposed two-phase fuzzy data mining method. The example
shows how the proposed algorithm can be used to generate fuzzy association rules from a set of transactions. The data set
includes six transactions in Table 3. Each transaction consists of a transaction identifier and purchased items. There are
five items, A, B, C, D and E, to be purchased. Each item is represented by a tuple (item name, item amount). For instance,
the first transaction consists of six units of A, seven units of B, three units of C, six units of D and four units of E.
In addition, the assumed membership functions for the item quantities are given in Figure 1.


Table 3: The Data Set Used in this Example

TID | Items
1 | (A, 6), (B, 7), (C, 3), (D, 6), (E, 4)
2 | (A, 5), (B, 7), (D, 11), (E, 6)
3 | (A, 5), (B, 7), (C, 3), (D, 6)
4 | (A, 12), (C, 10), (D, 11)
5 | (A, 8), (B, 9), (C, 7), (E, 7)
6 | (A, 9), (B, 4), (C, 7), (D, 10)


Figure 1: The Membership Functions Used in the Example

In this example, each item $x_k$ has three fuzzy grids: Small, Middle and Large. Thus, three fuzzy
membership values are produced for the quantity of each item according to the predefined membership functions.
Additionally, assume that the predefined minimum fuzzy support value and minimum fuzzy confidence value are 0.11 and
0.58, respectively. For the transaction data in Table 3, the two-phase mining method proceeds as follows.

Fuzzy AHP:

Step 1: We interviewed three managers to evaluate the relative importance of the five items. The results are shown as
$\tilde{A}^1$, $\tilde{A}^2$ and $\tilde{A}^3$, respectively. In each matrix, the rows and columns are ordered A, B, C, D, E.

$\tilde{A}^1$ | A | B | C | D | E
A | (1, 1, 3) | (5, 6, 7) | (1, 3, 5) | (1, 2, 3) | (3, 5, 7)
B | (0.143, 0.167, 0.2) | (1, 1, 3) | (1, 3, 5) | (0.143, 0.2, 0.333) | (1, 2, 3)
C | (0.2, 0.333, 1) | (0.2, 0.333, 1) | (1, 1, 3) | (0.143, 0.167, 0.2) | (1, 1, 3)
D | (0.333, 0.5, 1) | (3, 5, 7) | (5, 6, 7) | (1, 1, 3) | (1, 3, 5)
E | (0.143, 0.2, 0.333) | (0.333, 0.5, 1) | (0.333, 1, 1) | (0.2, 0.333, 1) | (1, 1, 3)

$\tilde{A}^2$ | A | B | C | D | E
A | (1, 1, 3) | (3, 4, 5) | (3, 5, 7) | (1, 1, 3) | (5, 6, 7)
B | (0.2, 0.25, 0.333) | (1, 1, 3) | (1, 2, 3) | (0.2, 0.25, 0.333) | (1, 3, 5)
C | (0.143, 0.2, 0.333) | (0.333, 0.5, 1) | (1, 1, 3) | (0.143, 0.2, 0.333) | (1, 3, 5)
D | (0.333, 1, 1) | (3, 4, 5) | (3, 5, 7) | (1, 1, 3) | (1, 3, 5)
E | (0.143, 0.167, 0.2) | (0.2, 0.333, 1) | (0.2, 0.333, 1) | (0.2, 0.333, 1) | (1, 1, 3)

$\tilde{A}^3$ | A | B | C | D | E
A | (1, 1, 3) | (1, 3, 5) | (3, 5, 7) | (3, 4, 5) | (1, 2, 3)
B | (0.2, 0.333, 1) | (1, 1, 3) | (3, 5, 7) | (1, 2, 3) | (1, 2, 3)
C | (0.143, 0.2, 0.333) | (0.143, 0.2, 0.333) | (1, 1, 3) | (0.143, 0.167, 0.2) | (0.2, 0.25, 0.333)
D | (0.2, 0.25, 0.333) | (0.333, 0.5, 1) | (5, 6, 7) | (1, 1, 3) | (1, 2, 3)
E | (0.333, 0.5, 1) | (0.333, 0.5, 1) | (3, 4, 5) | (0.333, 0.5, 1) | (1, 1, 3)


Step 2: For the three managers, the consistency indices (C.I.) of the fuzzy pairwise comparison matrices are 0.097457,
0.075712 and 0.08516, respectively. Hence, the three fuzzy pairwise comparison matrices are consistent.
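The text does not say how these C.I. values were obtained. One common choice, assumed here rather than taken from the paper, is Saaty's index C.I. = (λmax - n)/(n - 1) applied to a crisp reciprocal matrix, for example the matrix of middle TFN values. The sketch below estimates λmax by power iteration using only the standard library; `consistency_index` is a hypothetical helper name.

```python
def consistency_index(matrix, iterations=100):
    """Saaty's C.I. = (lambda_max - n) / (n - 1) for a crisp reciprocal matrix."""
    n = len(matrix)
    vec = [1.0 / n] * n          # start from a uniform priority vector
    lam = float(n)
    for _ in range(iterations):
        nxt = [sum(row[j] * vec[j] for j in range(n)) for row in matrix]
        lam = sum(nxt)           # vec sums to 1, so this estimates lambda_max
        vec = [v / lam for v in nxt]
    return (lam - n) / (n - 1)


# A perfectly consistent 3 x 3 matrix has lambda_max = n, hence C.I. = 0.
consistent = [[1, 2, 4], [0.5, 1, 2], [0.25, 0.5, 1]]
print(abs(consistency_index(consistent)) < 1e-9)  # prints True
```

Under this definition, smaller values mean more consistent judgments, and values below about 0.1 are commonly treated as acceptable, which agrees with the three values reported above.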

Step 3: Apply the geometric mean to the fuzzy comparison values of each item. The results are shown in Table 4.

Table 4: The TFN of Geometric Mean

  | A | B | C | D | E
A | (1.000, 1.000, 3.000) | (2.466, 4.160, 5.593) | (2.080, 4.217, 6.257) | (1.442, 2.000, 3.557) | (2.466, 3.915, 5.278)
B | (0.179, 0.240, 0.405) | (1.000, 1.000, 3.000) | (1.442, 3.107, 4.718) | (0.306, 0.464, 0.693) | (1.000, 2.289, 3.557)
C | (0.160, 0.237, 0.481) | (0.212, 0.322, 0.693) | (1.000, 1.000, 3.000) | (0.143, 0.177, 0.237) | (0.585, 0.909, 1.710)
D | (0.281, 0.500, 0.693) | (1.442, 2.154, 3.271) | (4.217, 5.646, 7.000) | (1.000, 1.000, 3.000) | (1.000, 2.621, 4.217)
E | (0.189, 0.255, 0.405) | (0.281, 0.437, 1.000) | (0.585, 1.101, 1.710) | (0.237, 0.382, 1.000) | (1.000, 1.000, 3.000)
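The entries of Table 4 appear to be the component-wise geometric mean, over the three managers, of the TFNs given for the same pair of items. The short sketch below checks this reading on the comparison of A against B; the helper name `aggregate_tfns` is an assumption made for illustration.

```python
def aggregate_tfns(tfns):
    """Component-wise geometric mean of a list of (L, M, U) triangular fuzzy numbers."""
    n = len(tfns)
    result = []
    for component in zip(*tfns):      # lowers, then middles, then uppers
        product = 1.0
        for value in component:
            product *= value
        result.append(product ** (1.0 / n))
    return tuple(result)


# Judgments of item A against item B from the three managers' matrices.
a_vs_b = aggregate_tfns([(5, 6, 7), (3, 4, 5), (1, 3, 5)])
print([round(v, 3) for v in a_vs_b])  # prints [2.466, 4.16, 5.593]
```

The result agrees with the cell in row A, column B of Table 4.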


Step 4: Compute the fuzzy weights; the results are shown in Table 5.
Step 5: Defuzzify the fuzzy weights; the results are shown in Table 6.


Table 5: The Results of Fuzzy Weights

Item | L | M | U
A | 0.160 | 0.423 | 1.089
B | 0.054 | 0.151 | 0.406
C | 0.028 | 0.066 | 0.199
D | 0.100 | 0.275 | 0.690
E | 0.034 | 0.086 | 0.277
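The weights in Table 5 can be reproduced from Table 4 with Eqs. (2) and (3). The sketch below is one reading of those equations, not the authors' code: it takes the geometric mean of each row per TFN component, then divides lower by the sum of uppers, middle by the sum of middles, and upper by the sum of lowers, which is the usual way to invert a sum of TFNs. The names `TABLE4` and `fuzzy_weights` are illustrative.

```python
import math

# Aggregated comparison matrix of Table 4; rows and columns are items A..E.
TABLE4 = [
    [(1.000, 1.000, 3.000), (2.466, 4.160, 5.593), (2.080, 4.217, 6.257),
     (1.442, 2.000, 3.557), (2.466, 3.915, 5.278)],
    [(0.179, 0.240, 0.405), (1.000, 1.000, 3.000), (1.442, 3.107, 4.718),
     (0.306, 0.464, 0.693), (1.000, 2.289, 3.557)],
    [(0.160, 0.237, 0.481), (0.212, 0.322, 0.693), (1.000, 1.000, 3.000),
     (0.143, 0.177, 0.237), (0.585, 0.909, 1.710)],
    [(0.281, 0.500, 0.693), (1.442, 2.154, 3.271), (4.217, 5.646, 7.000),
     (1.000, 1.000, 3.000), (1.000, 2.621, 4.217)],
    [(0.189, 0.255, 0.405), (0.281, 0.437, 1.000), (0.585, 1.101, 1.710),
     (0.237, 0.382, 1.000), (1.000, 1.000, 3.000)],
]


def fuzzy_weights(matrix):
    """Return one (L, M, U) weight per row of a fuzzy comparison matrix."""
    n = len(matrix)
    # Eq. (2): geometric mean of each row, taken per TFN component.
    r = [tuple(math.prod(cell[c] for cell in row) ** (1.0 / n) for c in range(3))
         for row in matrix]
    total = [sum(rs[c] for rs in r) for c in range(3)]
    # Eq. (3): inverting the fuzzy sum swaps its lower and upper bounds.
    return [(rs[0] / total[2], rs[1] / total[1], rs[2] / total[0]) for rs in r]


for item, w in zip("ABCDE", fuzzy_weights(TABLE4)):
    print(item, [round(v, 3) for v in w])
```

Because Table 4 is itself rounded to three decimals, the printed weights can differ from Table 5 in the last digit.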


Table 6: The Results of Defuzzification

Item | A | B | C | D | E
Weight | 0.558 | 0.204 | 0.098 | 0.355 | 0.132

Mining Fuzzy Association Rules:

Step 1: The quantitative values of the items in each transaction are represented by fuzzy sets. Take the first item in
Transaction 1 as an example. Applying the fuzzy partition method to the amount 6 of A yields the fuzzy set
(0.167/A.Small + 0.833/A.Middle) under the given membership functions (Figure 1). The step is repeated for the other
items, and the results are shown in Table 7.


Table 7: The Fuzzy Sets Transformed from the Data Set in Table 3

TID | Fuzzy sets
1 | (0.167/A.Small + 0.833/A.Middle), (1.000/B.Middle), (0.667/C.Small + 0.333/C.Middle), (0.167/D.Small + 0.833/D.Middle), (0.500/E.Small + 0.500/E.Middle)
2 | (0.333/A.Small + 0.667/A.Middle), (1.000/B.Middle), (0.333/D.Middle + 0.667/D.Large), (0.167/E.Small + 0.833/E.Middle)
3 | (0.333/A.Small + 0.667/A.Middle), (1.000/B.Middle), (0.667/C.Small + 0.333/C.Middle), (0.167/D.Small + 0.833/D.Middle)
4 | (0.167/A.Middle + 0.833/A.Large), (0.500/C.Middle + 0.500/C.Large), (0.333/D.Middle + 0.667/D.Large)
5 | (0.833/A.Middle + 0.167/A.Large), (0.667/B.Middle + 0.333/B.Large), (1.000/C.Middle), (1.000/E.Middle)
6 | (0.667/A.Middle + 0.333/A.Large), (0.500/B.Small + 0.500/B.Middle), (1.000/C.Middle), (0.500/D.Middle + 0.500/D.Large)

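Step 2: Construct the table FGTTFS and calculate the fuzzy support of each fuzzy grid for each item $x_k$; the results
are shown in Table 8.

Table 8: Initial Table FGTTFS

Fuzzy grid | FG: A.Small | A.Middle | ... | E.Middle | E.Large | TT: t1 | t2 | ... | t6 | FS
A.Small | 1 | 0 | ... | 0 | 0 | 0.093 | 0.186 | ... | 0.000 | 0.078
A.Middle | 0 | 1 | ... | 0 | 0 | 0.465 | 0.372 | ... | 0.372 | 0.357
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ...
E.Middle | 0 | 0 | ... | 1 | 0 | 0.066 | 0.110 | ... | 0.000 | 0.051
E.Large | 0 | 0 | ... | 0 | 1 | 0.000 | 0.000 | ... | 0.000 | 0.000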


Step 3: Check whether the fuzzy support of each grid is larger than or equal to 0.11. The frequent
1-dim fuzzy grids thus generated are A.Middle, A.Large, B.Middle and D.Middle.

Step 4: Set k := k + 1. The frequent (k-1)-dim (here 1-dim) fuzzy grids are joined, and the FS values of the joined
grids are computed to generate frequent k-dim (here 2-dim) fuzzy grids. The results are shown in Table 9.

In Table 9, two fuzzy grids, (A.Middle, D.Middle) and (A.Middle, B.Middle), are considered frequent
because their FS values are larger than or equal to 0.11.

Step 5: Two frequent grids are generated, so proceed with the following steps.

Step 6: Construct effective association rules for each frequent grid.

• List all frequent grids. The four possible association rules are as follows:

If A.Middle then D.Middle; If D.Middle then A.Middle;

If A.Middle then B.Middle; If B.Middle then A.Middle.

• Calculate the fuzzy weighted confidence values for the above rules. Take the second rule as an example.

FC(D.Middle → A.Middle) = 0.163 / 0.168 = 0.970.

The weighted confidence values of the other fuzzy association rules can be calculated in the same way.

Table 9: The Table FGTTFS of 2-Dim Fuzzy Grids

Fuzzy grid | FG: A.Middle | A.Large | B.Middle | D.Middle | TT: t1 | t2 | ... | t6 | FS
(A.Middle, D.Middle) | 1 | 0 | 0 | 1 | 0.296 | 0.118 | ... | 0.178 | 0.163
(A.Large, D.Middle) | 0 | 1 | 0 | 1 | 0.000 | 0.000 | ... | 0.178 | 0.049
(A.Middle, B.Middle) | 1 | 0 | 1 | 0 | 0.204 | 0.204 | ... | 0.102 | 0.142
(A.Large, B.Middle) | 0 | 1 | 1 | 0 | 0.000 | 0.000 | ... | 0.102 | 0.033
(B.Middle, D.Middle) | 0 | 0 | 1 | 1 | 0.204 | 0.118 | ... | 0.102 | 0.105


Step 7: Since the fuzzy weighted confidence value of (D.Middle, A.Middle) is 0.970, which is larger than 0.58,
this weighted frequent grid is an effective rule. In fact, the following two weighted frequent grids form the association
rules that are output.

• (D.Middle, A.Middle) with a fuzzy confidence of 0.970: if a middle number of D is bought, then a middle number of A is
bought with a confidence of 0.970.

• (B.Middle, A.Middle) with a fuzzy confidence of 1.000: if a middle number of B is bought, then a middle number of A is
bought with a confidence of 1.000.
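The whole mining phase of this example can be sketched compactly. The code below is an illustration under stated assumptions, not the authors' implementation: it assumes the item domain [1, 13] (inferred from the fuzzy sets in Table 7, since Figure 1 is not reproduced), uses the crisp weights of Table 6, treats an item missing from a transaction as having degree 0, and stops at 2-dim grids as the example does. All function and constant names are hypothetical.

```python
from itertools import combinations

TRANSACTIONS = [                       # Table 3
    {"A": 6, "B": 7, "C": 3, "D": 6, "E": 4},
    {"A": 5, "B": 7, "D": 11, "E": 6},
    {"A": 5, "B": 7, "C": 3, "D": 6},
    {"A": 12, "C": 10, "D": 11},
    {"A": 8, "B": 9, "C": 7, "E": 7},
    {"A": 9, "B": 4, "C": 7, "D": 10},
]
WEIGHTS = {"A": 0.558, "B": 0.204, "C": 0.098, "D": 0.355, "E": 0.132}  # Table 6
LABELS = ["Small", "Middle", "Large"]
LO, HI = 1, 13                         # assumed domain of every item
MIN_FS, MIN_FC = 0.11, 0.58


def weighted_degree(grid, transaction):
    """Weighted membership of one transaction in a 1-dim grid (item, label)."""
    item, label = grid
    if item not in transaction:
        return 0.0
    b = (HI - LO) / (len(LABELS) - 1)
    peak = LO + b * LABELS.index(label)
    return max(1.0 - abs(transaction[item] - peak) / b, 0.0) * WEIGHTS[item]


def fuzzy_support(grids):
    """Eqs. (5) and (6): mean over transactions of the minimum weighted degree."""
    total = sum(min(weighted_degree(g, t) for g in grids) for t in TRANSACTIONS)
    return total / len(TRANSACTIONS)


def mine_rules():
    """Return the frequent 1-dim grids and the rules found among 2-dim grids."""
    one_dim = [(i, l) for i in WEIGHTS for l in LABELS
               if fuzzy_support([(i, l)]) >= MIN_FS]
    rules = []
    for u, v in combinations(one_dim, 2):
        if u[0] == v[0]:               # both values belong to one item: invalid
            continue
        fs = fuzzy_support([u, v])
        if fs < MIN_FS:
            continue
        for ante, cons in ((u, v), (v, u)):
            fc = fs / fuzzy_support([ante])
            if fc >= MIN_FC:
                rules.append((ante, cons, fc))
    return one_dim, rules


frequent, found = mine_rules()
print(frequent)
for ante, cons, fc in found:
    print(ante, "->", cons, round(fc, 3))
```

Under these assumptions the sketch finds the same four frequent 1-dim grids and the same two rules as the example. The confidence of D.Middle → A.Middle comes out slightly above 0.970 because the example divides the rounded supports 0.163 and 0.168.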

5. CONCLUSIONS 

In this paper, we have proposed a two-phase fuzzy data mining method, which combines FAHP and the table
FGTTFS, to generate fuzzy weighted association rules. Since the useful fuzzy concepts can be linguistically
interpreted, decision makers can easily acquire fuzzy knowledge through them. Future research will attempt to
design different fuzzy data mining models for various problem domains.